PREFACE
scientist for Lockheed Engineering and Management Services Company, Inc.,
performed this research for the Earth Observations Division, Space and Life
1. INTRODUCTION


Recently, considerable interest has been shown in the development of
techniques for the classification of imagery data such as remote sensing data
obtained using the multispectral scanner (MSS) on board the Landsat.

Classification of multichannel imagery data is typically done by applying a
decision rule to each resolution element or picture element (pixel) and
classifying it based on spectral information. This procedure ignores spatial
information. Most imagery data contain much spatial information that
can be used to improve computer-assisted classification.

The use of contextual information in pattern classification has attracted the
attention of many researchers, mainly in the area of character recognition
(refs. 1, 2). Generally, one of two basic approaches has been used: the table
lookup method or the Markov approach. The table lookup method is based on the
assumption that every word to be recognized is selected from a known finite
table. A word is classified by comparing it with every word of the same
length in the table and finding the best match.

The Markov approach is based on the assumption that the true category of a
character is related in a probabilistic manner to the true categories of a
small number of surrounding characters. Its use requires the estimation of
the probability of occurrence of all possible pairs, triplets, etc., of
characters from the sample text. Abend (ref. 3) derived optimal procedures
when a Markov dependence exists between the states of nature of neighboring
characters, and Raviv (ref. 4) gives the results of applying such procedures
for the recognition of English text.

The use of contextual information in speech recognition is considered by Alter 
(ref. 5). Chow (ref. 6), using a nearest neighbor dependence method, obtained 
the structure and parameters of a recognition network for patterns represented 
by binary matrices. 

Several researchers have attempted to use spatial information in the
classification of imagery data. Kettig and Landgrebe (ref. 7) developed a
technique called Extraction and Classification of Homogeneous Objects (ECHO),
which segments a scene into homogeneous objects and uses sample
classification to assign each object as a whole, rather than by its
individual pixels. Haralick et al. (ref. 8) used textural features based on
gray-tone spatial dependence matrices to characterize a local scene texture
and experimentally showed them to be useful for classification purposes.
Swain (ref. 9) developed a cascade model for classifying a pattern based on
multiple observations in a time-varying environment. Welch and Salter
(ref. 10) presented a method for the contextual classification of imagery
data. Chittineni (ref. 11) discusses the use of context with linear
classifiers. Toussaint (ref. 12) gives a brief review of the use of context in
pattern recognition and presents an extensive list of references on the
subject.

All of the approaches proposed in the literature either use arbitrarily
selected transition probabilities or estimate them from a sample and treat
them as global. For imagery data such as those obtained in remote sensing,
the transition probabilities very often vary not only from one image to
another but also from one local neighborhood to another in the same
image. It is difficult to obtain global estimates of transition probabilities
because of the varying nature of imagery and the nonavailability of the true
classes of the pixels of images.

It is the purpose of this paper to develop methods for locally estimating
transition probabilities and to use these estimates in contextual
classification. It is assumed that the classifier is trained on representative
data from the image and that, for every pixel of the image, the a posteriori
probabilities of the classes are estimated from spectral information. Thus,
the incorporation of contextual information into classification is treated as
a postprocessing operation.

The number of transition probabilities to be estimated increases as the square
of the number of classes. Mathematical expressions for contextual
classification become complex with the increase in the size of the local
neighborhood. Thus, the estimation of transition probabilities is
computationally expensive. In this paper, the transition probabilities are
modeled in terms of a single parameter θ, under reasonable assumptions, and
methods are developed for the estimation of θ. The estimated θ is then used
for the incorporation of spatial information into classification.

The paper is organized as follows. Models for transition probabilities in
terms of a single parameter θ are developed in section 2. Techniques for
locally estimating the parameter θ of transition probabilities using the
maximum likelihood method are developed in section 3. Section 4 presents
expressions for using the contextual information in classification. Section 5
presents the results of contextual classification of remotely sensed
agricultural imagery data, using techniques developed in the paper.
Conclusions are presented in section 6. Appendix A presents an extension of
spatially uniform context to large neighborhoods. In appendix B, expressions
are developed for estimating transition probabilities for a two-class,
three-sequential-neighborhood case without using models. The results of
estimating the parameters of transition probabilities under different
transition probabilities models in different directions in the local
neighborhood are presented in appendix C. Appendix D presents a multitemporal
interpretation of the techniques developed in the paper for applications such
as in remote sensing.




2. MODELING TRANSITION PROBABILITIES 


The models for the transition probabilities of the classes of the neighboring
pixels, in terms of a single parameter θ, are developed under reasonable
assumptions. Let i and j be the neighboring pixels as shown in figure 2-1.

[Figure: two adjacent pixels, pixel i with (X_i, ω_i) and pixel j with
(X_j, ω_j).]

Figure 2-1.- Neighboring pixels i and j.

Let X_i and X_j be the pattern vectors and ω_i and ω_j be the labels (classes)
of pixels i and j, respectively. Let ω_i and ω_j take values of 1, 2, ..., M,
where M is the number of classes.


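A linear model describing the dependency between the neighboring pixels in
terms of a single parameter θ for different r and s is given in equation (1).

$$
\left.
\begin{aligned}
P(\omega_i = r \mid \omega_j = s) &= (1-\theta)\,P(\omega_i = r), \qquad r \neq s \\
P(\omega_i = r \mid \omega_j = r) &= (1-\theta)\,P(\omega_i = r) + \theta \\
0 \le \theta &\le 1
\end{aligned}
\right\} \tag{1}
$$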


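For θ = 0, equation (1) becomes

$$
\left.
\begin{aligned}
P(\omega_i = r \mid \omega_j = s) &= P(\omega_i = r) \\
\text{and}\quad P(\omega_i = r \mid \omega_j = r) &= P(\omega_i = r)
\end{aligned}
\right\} \tag{2}
$$

Equation (2) is the case where the labels of neighboring pixels are
independent. For θ = 1, equation (1) becomes

$$
\left.
\begin{aligned}
P(\omega_i = r \mid \omega_j = s) &= 0 \\
\text{and}\quad P(\omega_i = r \mid \omega_j = r) &= 1
\end{aligned}
\right\} \tag{3}
$$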



Equation (3) is the case where the labels of the neighboring pixels are
completely dependent. Notice that the linear transition probabilities model
of equation (1) is a linear interpolation in terms of a single parameter θ
between the extremes of equations (2) and (3). It can be easily shown that
the model of equation (1) satisfies the postulates of probabilities. That is,

$$
\left.
\begin{aligned}
0 \le P(\omega_i = r \mid \omega_j = s) &\le 1 \\
\sum_{r=1}^{M} P(\omega_i = r \mid \omega_j = s) &= 1
\end{aligned}
\right\} \tag{4}
$$



Using a quadratic interpolation between the extremes of equations (2) and (3),
a quadratic model describing the dependencies between the labels of
neighboring pixels can be written in terms of a single parameter θ as

$$
\left.
\begin{aligned}
P(\omega_i = r \mid \omega_j = s) &= (1-\theta)^2\,P(\omega_i = r), \qquad r \neq s \\
P(\omega_i = r \mid \omega_j = r) &= (1-\theta)^2\,P(\omega_i = r) + \theta(2-\theta) \\
0 \le \theta &\le 1
\end{aligned}
\right\} \tag{5}
$$

The model of equation (5) also satisfies the postulates of probabilities.
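
As an illustration of the linear and quadratic models, the following minimal sketch builds the M-by-M matrix of transition probabilities for each and checks that every column sums to 1. The function names, the column-per-conditioning-label layout, and the prior values are assumptions made for this example only.

```python
import numpy as np

def linear_transition(prior, theta):
    """T[r, s] = P(w_i = r | w_j = s) under the linear model of equation (1)."""
    prior = np.asarray(prior, dtype=float)
    M = prior.size
    base = np.tile(prior[:, None], (1, M))   # base[r, s] = P(w_i = r)
    # Every entry gets (1 - theta) * P(r); the diagonal (r == s) gets + theta.
    return (1.0 - theta) * base + theta * np.eye(M)

def quadratic_transition(prior, theta):
    """T[r, s] = P(w_i = r | w_j = s) under the quadratic model of equation (5)."""
    prior = np.asarray(prior, dtype=float)
    M = prior.size
    base = np.tile(prior[:, None], (1, M))
    return (1.0 - theta) ** 2 * base + theta * (2.0 - theta) * np.eye(M)

prior = np.array([0.4, 0.6])   # hypothetical a priori probabilities
for theta in (0.0, 0.3, 1.0):
    T_lin = linear_transition(prior, theta)
    T_quad = quadratic_transition(prior, theta)
    # Each column is a distribution over r for a fixed conditioning label s.
    assert np.allclose(T_lin.sum(axis=0), 1.0)
    assert np.allclose(T_quad.sum(axis=0), 1.0)
```

At θ = 0 both matrices repeat the prior in every column (independent labels), and at θ = 1 both reduce to the identity (completely dependent labels).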

However, it is to be noted that the dependencies between the neighboring
pixels can be modeled through some other parameter. For example, by replacing
θ with a suitable function of α, the dependencies are described in terms of
α, 0 < α < ∞; by replacing θ with a suitable function of e^(-β), the
dependencies are described in terms of β, -∞ < β < ∞.


The transition probabilities between the classes of the neighboring pixels i
and j also can be modeled to satisfy the following characteristics of
dependencies, resulting in a nonlinear model. Some of the general
characteristics of dependencies between neighboring pixels i and j can be
written as follows.

a. If the label ω_i = r of pixel i frequently occurs concurrently with the
   label ω_j = s of pixel j, then

   $$P(\omega_i = r \mid \omega_j = s) > P(\omega_i = r) \tag{6}$$

   and, if they always occur concurrently, then

   $$P(\omega_i = r \mid \omega_j = s) = 1 \tag{7}$$

b. If the label ω_i = r of pixel i rarely occurs concurrently with the label
   ω_j = s of pixel j, then

   $$P(\omega_i = r \mid \omega_j = s) < P(\omega_i = r) \tag{8}$$

   and, if they never occur concurrently, then

   $$P(\omega_i = r \mid \omega_j = s) = 0 \tag{9}$$

c. If the label ω_i = r of pixel i occurs independently of the label ω_j = s
   of pixel j, then

   $$P(\omega_i = r \mid \omega_j = s) = P(\omega_i = r) \tag{10}$$


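A model satisfying characteristics a, b, and c can be written in terms of a
single parameter θ for different r and s as

$$
\left.
\begin{aligned}
P(\omega_i = r \mid \omega_j = s) &= \frac{(1-\theta)\,P(\omega_i = r)}{(1-\theta) + \theta\,P(\omega_i = s)}, \qquad r \neq s \\
P(\omega_i = r \mid \omega_j = r) &= \frac{P(\omega_i = r)}{(1-\theta) + \theta\,P(\omega_i = r)} \\
\theta &\le 1
\end{aligned}
\right\} \tag{11}
$$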


It can be easily shown that the transition probabilities described by
equation (11) satisfy the postulates of probabilities. Also, notice that
requirements a, b, and c on the transition probabilities correspond to the
cases where θ < 0, θ > 0, and θ = 0, respectively. The model of equation (11)
is referred to in this paper as the nonlinear transition probabilities model.
Figure 2-2 illustrates the linear, quadratic, and nonlinear transition
probabilities models.
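
The nonlinear model can be sketched in the same matrix form. The expression used below is an assumed reading of equation (11), namely (1 - θ)P(r) / [(1 - θ) + θP(s)] for r ≠ s and P(r) / [(1 - θ) + θP(r)] for r = s. The lower limit of θ is not legible in the source, so the negative value tried here is also an assumption.

```python
import numpy as np

def nonlinear_transition(prior, theta):
    """T[r, s] = P(w_i = r | w_j = s) under an ASSUMED form of equation (11)."""
    prior = np.asarray(prior, dtype=float)
    M = prior.size
    # The denominator depends only on the conditioning label s.
    denom = (1.0 - theta) + theta * prior
    T = (1.0 - theta) * prior[:, None] / denom[None, :]   # entries with r != s
    T[np.diag_indices(M)] = prior / denom                  # entries with r == s
    return T

prior = np.array([0.4, 0.6])   # hypothetical a priori probabilities
for theta in (-0.5, 0.0, 0.5):
    T = nonlinear_transition(prior, theta)
    assert np.allclose(T.sum(axis=0), 1.0)   # columns are distributions
```

Under this assumed form, a positive θ lowers the off-diagonal entries below the prior (characteristic b), a negative θ raises them above it (characteristic a), and θ = 0 reproduces the prior (characteristic c).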






Figure 2-2.- Illustration of spatial dependency models. 


In the remainder of the paper, only the linear model of equation (1) and the 
nonlinear model of equation (11) are considered. 



3. LOCAL NEIGHBORHOOD ESTIMATION OF TRANSITION PROBABILITIES 









In this section, techniques are developed for the estimation of transition
probabilities in the local neighborhood of the pixel under consideration for
use in its contextual classification. The criterion used for their estimation
is the likelihood function. That is, the transition probabilities are
estimated as those that maximize the likelihood function of the observed
spectral vectors, if their spatial relationships are as given in the local
neighborhood.


3.1 A GENERAL EXPRESSION FOR THE LIKELIHOOD FUNCTION 

An expression for the likelihood function of N patterns in a general local
neighborhood is developed in the following. Let X_1, X_2, ..., X_N be the
patterns in the general neighborhood. The likelihood function of these
patterns can be written as

$$
\begin{aligned}
L' &= p(X_1, X_2, \ldots, X_N) \\
   &= \sum_{i_1=1}^{M}\sum_{i_2=1}^{M}\cdots\sum_{i_N=1}^{M}
      p(X_1, \ldots, X_N;\ \omega_1 = i_1, \ldots, \omega_N = i_N) \\
   &= \sum_{i_1=1}^{M}\sum_{i_2=1}^{M}\cdots\sum_{i_N=1}^{M}
      p(X_1, \ldots, X_N \mid \omega_1 = i_1, \ldots, \omega_N = i_N)\,
      P(\omega_1 = i_1, \ldots, \omega_N = i_N)
\end{aligned}
\tag{12}
$$


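where M is the number of classes. In the following, it is assumed that
(a) the probability density function of a pattern, given its label, is
independent of other patterns and their labels and (b) the labels of the
patterns are independent of the labels of their nonneighbors. By repeatedly
using assumption (a), the following is obtained. Consider

$$
\begin{aligned}
p(X_1, \ldots, X_N \mid \omega_1 = i_1, \ldots, \omega_N = i_N)
 &= p(X_1 \mid X_2, \ldots, X_N;\ \omega_1 = i_1, \ldots, \omega_N = i_N)\,
    p(X_2, \ldots, X_N \mid \omega_1 = i_1, \ldots, \omega_N = i_N) \\
 &= p(X_1 \mid \omega_1 = i_1)\,
    p(X_2 \mid X_3, \ldots, X_N;\ \omega_2 = i_2, \ldots, \omega_N = i_N)\,
    p(X_3, \ldots, X_N \mid \omega_3 = i_3, \ldots, \omega_N = i_N) \\
 &= p(X_1 \mid \omega_1 = i_1)\,p(X_2 \mid \omega_2 = i_2)\,
    p(X_3, \ldots, X_N \mid \omega_3 = i_3, \ldots, \omega_N = i_N) \\
 &= \prod_{j=1}^{N} p(X_j \mid \omega_j = i_j)
\end{aligned}
\tag{13}
$$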
Using equation (13) in equation (12) results in

$$
L' = \sum_{i_1=1}^{M}\sum_{i_2=1}^{M}\cdots\sum_{i_N=1}^{M}
     \left[\prod_{j=1}^{N} p(X_j \mid \omega_j = i_j)\right]
     P(\omega_1 = i_1, \ldots, \omega_N = i_N)
\tag{14}
$$

By the Bayes rule, p(X_j | ω_j = i_j) = p(ω_j = i_j | X_j) p(X_j) / P(ω_j = i_j).
Since $\prod_{j=1}^{N} p(X_j)$ is independent of the transition probabilities,
dividing both sides of equation (14) by it yields the criterion L to be used
for estimating the transition probabilities. That is,

$$
L = \sum_{i_1=1}^{M}\sum_{i_2=1}^{M}\cdots\sum_{i_N=1}^{M}
    \left[\prod_{j=1}^{N}
    \frac{p(\omega_j = i_j \mid X_j)}{P(\omega_j = i_j)}\right]
    P(\omega_1 = i_1, \ldots, \omega_N = i_N)
\tag{15}
$$

P(ω_1 = i_1, ..., ω_N = i_N) depends on the particular local neighborhood and
will be considered in detail in the following.

3.2 SPATIALLY UNIFORM CONTEXT - FOUR NEIGHBORS 

The pixel under consideration, pixel 0, and its four neighbors in a
two-dimensional local neighborhood are shown in figure 3-1.

[Figure: pixel 0 (X_0, ω_0) at the center, with pixel 1 (X_1, ω_1) above,
pixel 2 (X_2, ω_2) to the right, pixel 3 (X_3, ω_3) below, and pixel 4
(X_4, ω_4) to the left.]

Figure 3-1.- Four neighbors of pixel 0.

By repeatedly using assumption (b), equation (16) is obtained. From
equations (15) and (16), an expression for L for the local neighborhood of
figure 3-1 is obtained, as shown in equation (17).
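
The bodies of equations (16) and (17) are not legible in this copy. A sketch of what they would be, assuming assumption (b) makes the four neighbors conditionally independent given the label of pixel 0 and substituting the result into equation (15), is:

```latex
% Assumed form of equation (16): joint label probability for figure 3-1
P(\omega_0 = i_0, \omega_1 = i_1, \ldots, \omega_4 = i_4)
  = P(\omega_0 = i_0) \prod_{j=1}^{4} P(\omega_j = i_j \mid \omega_0 = i_0)

% Assumed form of equation (17): criterion L for figure 3-1
L = \sum_{i_0=1}^{M} p(\omega_0 = i_0 \mid X_0)
    \prod_{j=1}^{4} \left[ \sum_{i_j=1}^{M}
    \frac{p(\omega_j = i_j \mid X_j)}{P(\omega_j = i_j)}\,
    P(\omega_j = i_j \mid \omega_0 = i_0) \right]
```

The factor P(ω_0 = i_0) from the joint probability cancels the denominator of the j = 0 term in equation (15), which is why only p(ω_0 = i_0 | X_0) remains outside the product.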




3.2.1 AN EXPRESSION FOR L WITH LINEAR TRANSITION PROBABILITIES MODEL 

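Since a priori probabilities are position independent, when the linear model
of equation (1) is used for transition probabilities in equation (17), L
becomes

$$
\begin{aligned}
L &= \sum_{i_0=1}^{M} p(\omega_0 = i_0 \mid X_0) \prod_{j=1}^{4}
     \left[\sum_{i_j=1}^{M}
     \frac{p(\omega_j = i_j \mid X_j)}{P(\omega_j = i_j)}\,
     P(\omega_j = i_j \mid \omega_0 = i_0)\right] \\
  &= \sum_{i_0=1}^{M} p(\omega = i_0 \mid X_0) \prod_{j=1}^{4}
     \left[(1-\theta) + \theta\,
     \frac{p(\omega = i_0 \mid X_j)}{P(\omega = i_0)}\right] \\
  &= (1-\theta)^4 + \theta(1-\theta)^3 A + \theta^2(1-\theta)^2 B
     + \theta^3(1-\theta)\,C + \theta^4 D
\end{aligned}
$$

where

$$
A = \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P(\omega = i_0)}
    \left[p(\omega = i_0 \mid X_1) + p(\omega = i_0 \mid X_2)
    + p(\omega = i_0 \mid X_3) + p(\omega = i_0 \mid X_4)\right]
$$

$$
\begin{aligned}
B = \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^2(\omega = i_0)}
    \big[\,& p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_2)
    + p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_3) \\
    &+ p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_4)
    + p(\omega = i_0 \mid X_2)\,p(\omega = i_0 \mid X_3) \\
    &+ p(\omega = i_0 \mid X_2)\,p(\omega = i_0 \mid X_4)
    + p(\omega = i_0 \mid X_3)\,p(\omega = i_0 \mid X_4)\big]
\end{aligned}
$$

$$
\begin{aligned}
C = \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^3(\omega = i_0)}
    \big[\,& p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_2)\,p(\omega = i_0 \mid X_3) \\
    &+ p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_2)\,p(\omega = i_0 \mid X_4) \\
    &+ p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_3)\,p(\omega = i_0 \mid X_4) \\
    &+ p(\omega = i_0 \mid X_2)\,p(\omega = i_0 \mid X_3)\,p(\omega = i_0 \mid X_4)\big]
\end{aligned}
$$

$$
D = \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^4(\omega = i_0)}
    \left[p(\omega = i_0 \mid X_1)\,p(\omega = i_0 \mid X_2)\,
    p(\omega = i_0 \mid X_3)\,p(\omega = i_0 \mid X_4)\right]
$$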
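
The polynomial form of L for the four-neighbor case can be checked numerically. The sketch below evaluates L(θ) both as a product over neighbors and as the polynomial in θ and (1 - θ) whose coefficients are sums over the classes; the function names and array layout are assumptions of this example, not part of the original report.

```python
import numpy as np
from itertools import combinations

def likelihood_product(post0, post_nb, prior, theta):
    """L = sum_i p(i|X0) * prod_j [(1 - theta) + theta * p(i|Xj) / P(i)]."""
    ratios = post_nb / prior                  # shape (n_neighbors, M)
    factors = (1.0 - theta) + theta * ratios
    return float(np.sum(post0 * np.prod(factors, axis=0)))

def likelihood_polynomial(post0, post_nb, prior, theta):
    """Same L written as sum_k theta^k (1 - theta)^(n - k) * c_k."""
    ratios = post_nb / prior
    n = ratios.shape[0]
    total = 0.0
    for k in range(n + 1):
        # Sum over all k-subsets of neighbors of the product of their ratios;
        # k = 0 gives 1, k = 1 gives the bracket of A, k = 2 that of B, etc.
        e_k = sum(np.prod(ratios[np.array(c, dtype=int)], axis=0)
                  for c in combinations(range(n), k))
        coeff = float(np.sum(post0 * e_k))
        total += theta ** k * (1.0 - theta) ** (n - k) * coeff
    return total

# Hypothetical two-class a posteriori probabilities for pixel 0 and 4 neighbors.
post0 = np.array([0.285, 0.715])
post_nb = np.array([[0.6, 0.4], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
prior = np.array([0.4, 0.6])
for theta in (0.0, 0.3, 1.0):
    a = likelihood_product(post0, post_nb, prior, theta)
    b = likelihood_polynomial(post0, post_nb, prior, theta)
    assert np.isclose(a, b)
```

The product form costs one pass over the neighbors per class, so it is the cheaper of the two when only the value of L at a given θ is needed.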


3.2.2 AN EXPRESSION FOR L WITH NONLINEAR TRANSITION PROBABILITIES MODEL

Using the nonlinear transition probabilities model of equation (11) in
equation (18) gives the following expression for L.



3.3 SEQUENTIAL NEIGHBORHOOD - GENERAL CASE


A general N-pixel sequential neighborhood is shown in figure 3-2. 



Figure 3-2.- A general N-pixel sequential neighborhood. 
The transitions for which the transition probabilities are applied in the
sequential neighborhood are indicated in figure 3-2. An expression is
developed here for the likelihood function L of the patterns in a general
sequential neighborhood of N pixels. Only the pixels immediately adjacent to
each pixel are treated as its neighbors. Consider

$$
\begin{aligned}
P(\omega_1 = i_1, \ldots, \omega_N = i_N)
 &= P(\omega_2 = i_2)\,
    P(\omega_1 = i_1, \omega_3 = i_3, \ldots, \omega_N = i_N \mid \omega_2 = i_2) \\
 &= P(\omega_2 = i_2)\,
    P(\omega_1 = i_1 \mid \omega_2 = i_2, \omega_3 = i_3, \ldots, \omega_N = i_N)\,
    P(\omega_3 = i_3, \ldots, \omega_N = i_N \mid \omega_2 = i_2) \\
 &= P(\omega_2 = i_2)\,P(\omega_1 = i_1 \mid \omega_2 = i_2)\,
    P(\omega_3 = i_3, \ldots, \omega_N = i_N \mid \omega_2 = i_2)
\end{aligned}
\tag{20}
$$


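Assumption (b) is used in obtaining equation (20). The Bayes rule is used to
obtain the following.

$$
P(\omega_3 = i_3, \ldots, \omega_N = i_N \mid \omega_2 = i_2)
 = \frac{P(\omega_2 = i_2, \omega_3 = i_3, \ldots, \omega_N = i_N)}
        {P(\omega_2 = i_2)}
\tag{21}
$$

Proceeding in a manner similar to equation (20), the numerator of
equation (21) can be written as

$$
P(\omega_2 = i_2, \omega_3 = i_3, \ldots, \omega_N = i_N)
 = P(\omega_3 = i_3)\,P(\omega_2 = i_2 \mid \omega_3 = i_3)\,
   P(\omega_4 = i_4, \ldots, \omega_N = i_N \mid \omega_3 = i_3)
\tag{22}
$$

Continuing in a similar manner obtains the following result.

$$
P(\omega_{N-1} = i_{N-1}, \omega_N = i_N)
 = P(\omega_{N-1} = i_{N-1})\,P(\omega_N = i_N \mid \omega_{N-1} = i_{N-1})
\tag{23}
$$

The following is obtained from the Bayes rule.

$$
P(\omega_2 = i_2 \mid \omega_3 = i_3)
 = \frac{P(\omega_2 = i_2)}{P(\omega_3 = i_3)}\,
   P(\omega_3 = i_3 \mid \omega_2 = i_2)
\tag{24}
$$

Equation (25) is obtained from equations (20) through (24).

$$
P(\omega_1 = i_1, \omega_2 = i_2, \ldots, \omega_N = i_N)
 = P(\omega_1 = i_1)\,P(\omega_2 = i_2 \mid \omega_1 = i_1)\,
   P(\omega_3 = i_3 \mid \omega_2 = i_2) \cdots
   P(\omega_N = i_N \mid \omega_{N-1} = i_{N-1})
\tag{25}
$$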


Substitution of equation (25) into equation (15) results in an expression for
the criterion L for a general sequential neighborhood. That is,



3.3.1 THE LIKELIHOOD FUNCTION L OF PATTERNS IN A SEQUENTIAL NEIGHBORHOOD 
WITH THE LINEAR TRANSITION PROBABILITIES MODEL 

In this section, equation (26) is expressed in a polynomial form in terms of θ
for a four-pixel sequential neighborhood, using the linear transition
probabilities model of equation (1). The four-pixel sequential neighborhood
considered is shown in figure 3-3.



Figure 3-3.- A four-pixel sequential neighborhood. 


The likelihood function of equation (26) for the neighborhood of figure 3-3 
becomes 



$$
L = \sum_{i_2=1}^{M} p(\omega_2 = i_2 \mid X_2)
    \left[(1-\theta)^3 + \theta(1-\theta)^2(\cdots)
    + \theta^2(1-\theta)(\cdots) + \cdots\right]
\tag{27}
$$
Since the a priori probabilities are position independent in the local
neighborhood, the different quantities in equation (27) can be shown to be

Using the linear model of equation (1) for the transition probabilities and the
definitions in equation (28) in equation (26), expressions for the likelihood
function for different sizes of sequential neighborhoods can be easily written
and are listed in table 3-1.
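
For a sequential neighborhood of any size, the criterion L can also be evaluated numerically instead of being expanded by hand. The sketch below assumes that L is equation (15) with the joint label probability factored as in equation (25) and with the linear model of equation (1). It computes L by a left-to-right recursion and compares it with a brute-force sum over all label sequences. All names are illustrative.

```python
import numpy as np
from itertools import product

def linear_transition(prior, theta):
    """T[r, s] = P(w_n = r | w_(n-1) = s) under the linear model."""
    M = prior.size
    return (1.0 - theta) * np.tile(prior[:, None], (1, M)) + theta * np.eye(M)

def chain_likelihood(posts, prior, theta):
    """L for a sequential neighborhood; posts[n] is p(w | X_n) for pixel n."""
    T = linear_transition(prior, theta)
    alpha = posts[0].copy()               # p(i|X_1)/P(i) * P(i)
    for post in posts[1:]:
        # Marginalize the previous label, then weight by p(i|X_n)/P(i).
        alpha = (post / prior) * (T @ alpha)
    return float(alpha.sum())

def chain_likelihood_bruteforce(posts, prior, theta):
    """Direct sum over all M**N label sequences (for checking only)."""
    T = linear_transition(prior, theta)
    N, M = posts.shape
    total = 0.0
    for labels in product(range(M), repeat=N):
        term = prior[labels[0]]
        for n in range(1, N):
            term *= T[labels[n], labels[n - 1]]
        for n in range(N):
            term *= posts[n, labels[n]] / prior[labels[n]]
        total += term
    return total

# Hypothetical posteriors for a four-pixel sequential neighborhood.
posts = np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.55, 0.45]])
prior = np.array([0.45, 0.55])
assert np.isclose(chain_likelihood(posts, prior, 0.3),
                  chain_likelihood_bruteforce(posts, prior, 0.3))
```

The recursion needs on the order of N times M squared operations, whereas the direct sum grows as M to the power N.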


3.3.2 EXPRESSIONS FOR THE LIKELIHOOD FUNCTION L OF A SEQUENTIAL NEIGHBORHOOD 
WITH NONLINEAR TRANSITION PROBABILITIES MODEL 

Using the nonlinear transition probabilities model of equation (11) in 
equation (26), expressions for the likelihood function for several sequential 
neighborhoods can be easily derived. These are illustrated for three-, four-, 
and five-pixel sequential neighborhoods in the following expressions. In 
order for the transition probabilities model to hold true, the transitions in 
the neighborhood must be as indicated in figures 3-4, 3-5, and 3-6. Define 
The likelihood function for the three-pixel sequential neighborhood of
figure 3-4 is given by

$$L_3(\theta) = \sum_{i_3=1}^{M} p(\omega_3 = i_3 \mid X_3)\,a_{45}(i_3, \theta)$$


TABLE 3-1.- EXPRESSIONS FOR THE LIKELIHOOD FUNCTION FOR DIFFERENT SIZES OF
SEQUENTIAL NEIGHBORHOODS WITH THE LINEAR TRANSITION PROBABILITIES MODEL
The likelihood function for the four-pixel sequential neighborhood of
figure 3-5 is given by

$$L_4(\theta) = \sum_{i_2=1}^{M} p(\omega_2 = i_2 \mid X_2)\,a_{345}(i_2, \theta)$$

The likelihood function for the five-pixel sequential neighborhood of
figure 3-6 is given by
Figure 3-5.- A four-pixel sequential neighborhood.

Figure 3-6.- A five-pixel sequential neighborhood.


3.4 COMPUTATION OF θ BY THE MAXIMIZATION OF LIKELIHOOD FUNCTION

With both linear and nonlinear transition probabilities models, the likelihood
function is a continuous function of the parameter θ. The parameter θ that
maximizes the likelihood function with the nonlinear transition probabilities
model can be obtained using a one-dimensional bounded search, since the
parameter θ is bounded and the likelihood function is nonlinear. With the
linear transition probabilities model, the likelihood function is a polynomial
in the parameter θ. The flow diagram (refs. 13-16) of figure 3-7 can be used
to find the optimal θ (θ_opt) of the linear transition probabilities model in
the range 0 ≤ θ ≤ 1, which gives the global maximum for the likelihood
function.
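
The flow diagram of figure 3-7 is not reproduced here. As a stand-in, the following sketch finds the global maximum of a polynomial likelihood on the unit interval by comparing its values at the end points and at the real roots of its derivative. This is a generic approach and not necessarily the procedure of refs. 13-16; the helper names and the example coefficients are hypothetical.

```python
import numpy as np

def mixed_form_to_poly(coeffs):
    """Expand sum_k c_k * theta**k * (1 - theta)**(n - k) into a numpy.poly1d."""
    n = len(coeffs) - 1
    theta = np.poly1d([1.0, 0.0])         # the polynomial "theta"
    one_minus = np.poly1d([-1.0, 1.0])    # the polynomial "1 - theta"
    total = np.poly1d([0.0])
    for k, c in enumerate(coeffs):
        total = total + c * theta ** k * one_minus ** (n - k)
    return total

def argmax_on_unit_interval(poly):
    """Return (theta_opt, L(theta_opt)) for a numpy.poly1d on [0, 1]."""
    candidates = [0.0, 1.0]
    for root in poly.deriv().roots:
        # Keep only real stationary points that lie inside the interval.
        if abs(root.imag) < 1e-12 and 0.0 <= root.real <= 1.0:
            candidates.append(float(root.real))
    values = [float(poly(t)) for t in candidates]
    best = int(np.argmax(values))
    return candidates[best], values[best]

# Hypothetical coefficients (1, A, B, C, D) of a four-neighbor likelihood.
L_poly = mixed_form_to_poly([1.0, 4.4, 7.5, 5.9, 1.8])
theta_opt, L_max = argmax_on_unit_interval(L_poly)
```

Because every stationary point inside the interval is examined, the result is the global maximum on [0, 1] rather than a local one, at the cost of one root-finding call on a low-degree polynomial.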







Figure 3-7.- Procedure for finding θ_opt in the range 0 ≤ θ ≤ 1,
which gives the global maximum for L(θ).








Optimal transition probabilities that maximize the likelihood function for
some typical sequential neighborhoods, with both linear and nonlinear
transition probabilities models, are given in appendix C.

4. UPDATING A POSTERIORI PROBABILITIES

Using the transition probabilities models of section 2, methods are developed
in this section for incorporating contextual information into the classifier
decision process.

4.1 UPDATING THE A POSTERIORI PROBABILITIES OF A PIXEL USING INFORMATION
FROM A SINGLE NEIGHBOR

Expressions are developed for updating the a posteriori probabilities of the
labels of a pixel using information from its single neighbor. These are used
to exploit contextual information from large local neighborhoods. Let the
pixel under consideration be X_n and its neighbors be X_(n-1) and X_(n+1).
Figure 4-1 shows the positions of these pixels.


[Figure: three pixels in a row: (X_(n-1), ω_(n-1)), (X_n, ω_n), and
(X_(n+1), ω_(n+1)).]

Figure 4-1.- Illustration of pixel n under consideration
and its neighbors.

The assumptions used for updating the a posteriori probabilities are the same
as those made in section 3. Namely: (a) the probability density function of
a pattern, given its label, is independent of other patterns and their labels;
(b) the labels of the pixels are independent of the labels of their
nonneighbors. These assumptions are used in the rest of the section. The
information contained in the pattern X_(n-1) regarding the label of the
pattern X_n can be written in terms of transition probabilities as

$$
\begin{aligned}
p(\omega_n = i \mid X_{n-1})
 &= \sum_{j=1}^{M} p(\omega_n = i, \omega_{n-1} = j \mid X_{n-1}) \\
 &= \sum_{j=1}^{M} P(\omega_n = i \mid \omega_{n-1} = j)\,
    p(\omega_{n-1} = j \mid X_{n-1})
\end{aligned}
\tag{33}
$$

Similarly, the following is obtained.
Now, the a posteriori probabilities of the labels of the pattern X_n are
updated using the information from the patterns X_n and X_(n-1) and their
spatial relationship as follows, using the assumptions (a) and (b) above.

Using the linear transition probabilities model of equation (1) in
equation (35) yields

The information in the pattern X_(n+1), in obtaining the label of pattern
X_n, can be written as follows.

Similarly, the following is obtained.

Using the patterns X_n and X_(n+1), one has

Using the linear transition probabilities model of equation (1) in
equation (39) yields the following.


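$$
p(\omega_n = i \mid X_n, X_{n+1})
 = \frac{p(\omega_n = i \mid X_n)
   \left[(1-\theta) + \theta\,
   \dfrac{p(\omega_{n+1} = i \mid X_{n+1})}{P(\omega_{n+1} = i)}\right]}
   {(1-\theta) + \theta \displaystyle\sum_{j=1}^{M}
   \dfrac{p(\omega_{n+1} = j \mid X_{n+1})}{P(\omega_{n+1} = j)}\,
   p(\omega_n = j \mid X_n)}
\tag{40}
$$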
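
The single-neighbor update with the linear model reduces to a reweighting of the pixel's own a posteriori probabilities followed by normalization. A minimal sketch follows; the function name and the neighbor values are assumptions of this example, while the central-pixel values and priors are taken from tables 5-1 and 5-2.

```python
import numpy as np

def update_from_neighbor(post_n, post_nb, prior, theta):
    """Update p(w_n | X_n) with one neighbor under the linear model.

    post_n  : a posteriori probabilities of the pixel being updated
    post_nb : a posteriori probabilities of the neighbor
    prior   : a priori class probabilities (position independent)
    """
    post_n = np.asarray(post_n, dtype=float)
    post_nb = np.asarray(post_nb, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Per-class weight: (1 - theta) + theta * p(w_nb = i | X_nb) / P(w = i).
    weights = (1.0 - theta) + theta * post_nb / prior
    unnormalized = post_n * weights
    # The normalizer equals (1 - theta) + theta * sum_j p_nb(j) p_n(j) / P(j).
    return unnormalized / unnormalized.sum()

post_center = np.array([0.285, 0.715])      # central pixel of table 5-1
post_neighbor = np.array([0.757, 0.243])    # illustrative neighbor values
prior = np.array([0.4087, 0.5913])          # priors from table 5-2, iteration 1
updated = update_from_neighbor(post_center, post_neighbor, prior, theta=0.4)
```

With θ = 0 the weights are all 1 and the pixel's probabilities are returned unchanged; larger θ pulls the result toward the classes favored by the neighbor.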

4.2 USE OF SINGLE-NEIGHBOR UPDATING EQUATIONS FOR LARGE LOCAL NEIGHBORHOODS

This section shows how single-neighbor updating equations can be used 
repeatedly to exploit spatial information in large local neighborhoods. 




4.2.1 SPATIALLY UNIFORM CONTEXT - FOUR NEIGHBORS 

Consider the pixel under consideration, pixel 0, and its neighbors in the
local neighborhood shown in figure 3-1. In this section, expressions are
developed for obtaining the a posteriori probabilities of the classes of
pixel 0, using information from its local neighborhood. Consider
equation (41), where f = p(X_0, X_1, ..., X_4). Using equations (13), (16),
and (17) in equation (41) yields equation (42).

From equations (39) and (42), the following is easily understood. Updating
the a posteriori probabilities of the classes of pixel 0, using information
from its neighbors as shown in figure 3-1, is equivalent to using the single-
neighbor updating equation (39) repeatedly, taking one neighbor at a time.
The sequence in which the neighbors are used is immaterial.
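
That the order of the neighbors does not matter can be checked numerically. The sketch below applies a single-neighbor update with the linear model (written here as an assumed helper) for every ordering of four hypothetical neighbors and confirms that all orderings give the same result.

```python
import numpy as np
from itertools import permutations

def update_from_neighbor(post_n, post_nb, prior, theta):
    """Single-neighbor update under the linear model (illustrative helper)."""
    weights = (1.0 - theta) + theta * post_nb / prior
    unnormalized = post_n * weights
    return unnormalized / unnormalized.sum()

post0 = np.array([0.285, 0.715])                 # pixel 0
neighbors = np.array([[0.6, 0.4], [0.8, 0.2],    # hypothetical pixels 1 to 4
                      [0.1, 0.9], [0.2, 0.8]])
prior = np.array([0.4, 0.6])
theta = 0.35

results = []
for order in permutations(range(4)):
    post = post0.copy()
    for k in order:
        post = update_from_neighbor(post, neighbors[k], prior, theta)
    results.append(post)

# Every ordering yields the same updated probabilities.
assert all(np.allclose(results[0], r) for r in results)
```

The invariance follows because each update multiplies by a per-class weight and the normalization is a single scalar, so the weights commute.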

4.2.2 SEQUENTIAL NEIGHBORHOOD - GENERAL CASE

This section considers the problem of updating the a posteriori probabilities
of the classes of the pixel under consideration, pixel j, in a general
sequential neighborhood. The location of pixel j in a general sequential
neighborhood of N pixels is shown in figure 4-2.

1    2    3    ...    j    ...    N - 1    N

Figure 4-2.- The pixel under consideration, pixel j, and its general
sequential neighborhood.

The transitions for which the estimated transition probabilities apply in 
the whole sequential neighborhood are indicated in figure 4-2. Consider 
equation (43) where NU(ij) is the numerator of the first expression in 
equation (43). Using equations (13) and (25) in the numerator of 
equation (43) results in equation (44). 


If the numerator and denominator of equation (43) are divided by
$\prod_{l=1}^{N} p(X_l)$, the numerator of equation (43) can then be written
as shown in equation (45).

The term in the first set of brackets of equation (45) is the contribution
from pixels to the left of pixel j (see fig. 4-2), the term in the second set 
of brackets is the contribution from pixels to the right of pixel j, and the 
first term is the contribution from pixel j to the a posteriori probabilities 
of the classes of pixel j. These contributions appear in multiplicative form 
in equation (45). 

An examination of equations (35), (39), and (45) reveals that the single- 
neighbor updating equations (35) and (39) can be used repeatedly to update the 
a posteriori probabilities of the classes of pixel j, using information from 
its sequential neighborhood as follows. Equation (39) is used to update the 
a posteriori probabilities of pixel (N - 1), using the a posteriori 
probabilities of pixels (N - 1) and N. The updated a posteriori probabilities 
of pixel (N - 1) and the a posteriori probabilities of pixel (N - 2) are used 
to update those of pixel (N - 2) . Proceeding in a similar manner, the updated 
a posteriori probabilities of pixel (j + 1) and the a posteriori probabilities 
of pixel j are used to update those of pixel j. Similarly, equation (35) is 
used to update the a posteriori probabilities of pixel j, using information 
from pixels to the left of pixel j. The a posteriori probabilities of 
pixels 1 and 2 are used to update those of pixel 2. The updated a posteriori 
probabilities of pixel 2 and the a posteriori probabilities of pixel 3 are 
used to update the a posteriori probabilities of pixel 3. The process is 
repeated until the updated a posteriori probabilities of pixel (j - 1) and the
previously updated ones of pixel j are used to update those of pixel j.
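
The two sweeps described above can be sketched as follows, assuming the linear model and a single-neighbor update written as an illustrative helper. Pixel indices in the code are 0-based, so pixel j of the text corresponds to index j - 1; all names are hypothetical.

```python
import numpy as np

def update_from_neighbor(post_n, post_nb, prior, theta):
    """Single-neighbor update under the linear model (illustrative helper)."""
    weights = (1.0 - theta) + theta * post_nb / prior
    unnormalized = post_n * weights
    return unnormalized / unnormalized.sum()

def update_in_sequence(posts, j, prior, theta):
    """Update pixel j (0-based) from its whole sequential neighborhood."""
    posts = np.asarray(posts, dtype=float)
    N = posts.shape[0]
    result = posts[j]
    if j < N - 1:
        # Right-to-left sweep: carry information from pixel N - 1 down to j + 1.
        right = posts[N - 1]
        for n in range(N - 2, j, -1):
            right = update_from_neighbor(posts[n], right, prior, theta)
        result = update_from_neighbor(result, right, prior, theta)
    if j > 0:
        # Left-to-right sweep: carry information from pixel 0 up to j - 1.
        left = posts[0]
        for n in range(1, j):
            left = update_from_neighbor(posts[n], left, prior, theta)
        result = update_from_neighbor(result, left, prior, theta)
    return result

# Hypothetical five-pixel sequential neighborhood, updating the middle pixel.
posts = np.array([[0.7, 0.3], [0.4, 0.6], [0.285, 0.715],
                  [0.8, 0.2], [0.55, 0.45]])
prior = np.array([0.45, 0.55])
updated = update_in_sequence(posts, j=2, prior=prior, theta=0.3)
```

Each sweep touches every pixel on its side once, so the cost grows linearly with the size of the sequential neighborhood.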

4.2.3 APPLICATION OF SEQUENTIAL CONTEXT TO TWO-DIMENSIONAL NEIGHBORHOODS

The expressions for the likelihood function and updating equations become
complex with the increase in the size of the local neighborhood. Hence, it is
proposed to use sequentially the sequential context for two-dimensional local
neighborhoods. It is desirable that the updating be independent of the
sequence of the sequential neighborhoods in which the updating is done. From
equation (45) it is seen that, with the use of sequential neighborhoods
(centering on the pixel under consideration), the updating is independent of
the sequence of the sequential neighborhoods in which the updating is done.

The sequential neighborhoods to be used in updating, then, are the ones
centering on the pixel under consideration in four directions: 0°, 45°, 90°,
and 135°. A few typical two-dimensional local neighborhoods composed of these
sequential neighborhoods are illustrated in figure 4-3.
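
As an illustration, the four directional sequential neighborhoods centered on a pixel can be extracted from a grid of a posteriori probabilities as sketched below. The array layout (rows, columns, classes), the angle convention, and the function name are assumptions of this example.

```python
import numpy as np

def directional_sequences(grid, row, col, half):
    """Return the 0, 45, 90, and 135 degree sequences through (row, col).

    grid : array of shape (rows, cols, M) holding p(w | X) for every pixel
    half : number of pixels taken on each side of the center
    """
    # (row step, column step) per direction; row index grows downward.
    steps = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    offsets = range(-half, half + 1)
    return {
        angle: np.stack([grid[row + k * dr, col + k * dc] for k in offsets])
        for angle, (dr, dc) in steps.items()
    }

# A 5-by-5 grid whose "probabilities" encode the pixel position, for clarity.
grid = np.zeros((5, 5, 2))
for r in range(5):
    for c in range(5):
        grid[r, c] = (r, c)

seqs = directional_sequences(grid, row=2, col=2, half=2)
# Each sequence has 5 pixels and the center pixel sits at index 2.
assert all(s.shape == (5, 2) for s in seqs.values())
```

Each returned sequence can then be passed, one direction at a time, to a sequential updating routine with the center pixel as the pixel under consideration.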
5. EXPERIMENTAL RESULTS

In this section, some results are obtained by applying the theory developed in
the previous sections to the classification of the remotely sensed Landsat MSS
data. Several segments^1 were processed in the following manner. The image
was overlaid with a rectangular grid of 209 grid intersections, and the labels
of the pixels or dots corresponding to each grid intersection were acquired.
Two classes are in the image: class 1 is wheat, and class 2 is nonwheat,
designated "other." A linear classifier is trained on one-half of the labeled
data. The remaining one-half of the labeled data is used as a test set. The
a posteriori probabilities of the classes of the pixels are estimated by
normalizing the discriminant function values of the classes.

5.1 COMPUTATIONAL RESULTS FOR A TYPICAL 5-BY-5 NEIGHBORHOOD 

The a posteriori probabilities of the classes of the pixels in a typical
5-by-5 neighborhood from an MSS image of segment 1739 are listed in
table 5-1. This segment is in Teton County, Montana.

TABLE 5-1.- THE A POSTERIORI PROBABILITIES OF THE CLASSES
IN A 5-BY-5 NEIGHBORHOOD

[The first entry is p(ω = 1|X) and the second entry is p(ω = 2|X)]

(0.716,0.284)  (0.322,0.678)  (0.820,0.180)  (0.669,0.331)  (0.326,0.674)
(0.629,0.171)  (0.899,0.101)  (0.397,0.103)  (0.762,0.238)  (0.886,0.114)
(0.625,0.375)  (0.158,0.842)  (0.285,0.715)  (0.757,0.243)  (0.117,0.883)
(0.087,0.913)  (0.062,0.938)  (0.060,0.940)  (0.080,0.920)  (0.090,0.910)
(0.125,0.875)  (0.089,0.911)  (0.132,0.868)  (0.157,0.843)  (0.127,0.873)


^1 A segment is a 9- by 11-kilometer (5- by 6-nautical-mile) area for which the
MSS image is divided into a 117-row by 196-column rectangular array of
pixels.






























The pixel under consideration is the central pixel of the neighborhood. The
a priori probabilities are estimated as an average of the a posteriori
probabilities in the neighborhood. Consider the following.

$$
\begin{aligned}
P(\omega = 1) &= \int p(\omega = 1, X)\,dX \\
 &= \int p(\omega = 1 \mid X)\,p(X)\,dX \\
 &= E_{p(X)}\left[p(\omega = 1 \mid X)\right] \\
 &\approx \frac{1}{N}\sum_{j=1}^{N} p(\omega = 1 \mid X_j)
\end{aligned}
\tag{46}
$$

where X_j (j = 1, 2, ..., N) are the pixels in the local neighborhood. The
a posteriori probabilities of the classes of the pixel under consideration are
updated using sequential context and the procedure described in section 4.2.3.
This procedure is repeated for five iterations, and the computational results
are listed in table 5-2.
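
The estimate of equation (46) is a per-class average over the neighborhood. A minimal sketch, with a hypothetical function name and illustrative input values, is:

```python
import numpy as np

def estimate_priors(neighborhood_posteriors):
    """Average the a posteriori probabilities over all pixels (equation (46)).

    neighborhood_posteriors : array whose last axis indexes the classes,
    for example shape (5, 5, M) for a 5-by-5 neighborhood.
    """
    posts = np.asarray(neighborhood_posteriors, dtype=float)
    # Flatten the spatial axes and average each class over the N pixels.
    return posts.reshape(-1, posts.shape[-1]).mean(axis=0)

# Illustrative 2-by-2 neighborhood with two classes.
example = np.array([[[0.2, 0.8], [0.4, 0.6]],
                    [[0.9, 0.1], [0.5, 0.5]]])
priors = estimate_priors(example)
```

If every pixel's probabilities sum to 1, the estimated priors also sum to 1, so no further normalization is needed.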


TABLE 5-2.- COMPUTATIONAL RESULTS OF UPDATING THE A POSTERIORI
PROBABILITIES OF THE CENTRAL PIXEL IN A 5-BY-5 NEIGHBORHOOD
(USING THE LINEAR TRANSITION PROBABILITIES MODEL)

Iter-  A posteriori     A priori         Estimates of parameter θ      A posteriori
ation  probabilities    probabilities    for different sequential      probabilities
       before updating  in the           neighborhoods                 after updating
                        neighborhood     0°    45°     90°   135°

  1    (0.285,0.715)    (0.4087,0.5913)  0.0   0.2904  0.4   0.4       (0.3574,0.6426)
  2    (0.3574,0.6426)  (0.4130,0.5870)  0.0   0.2655  0.4   0.4       (0.4315,0.5685)
  3    (0.4315,0.5685)  (0.4173,0.5827)  0.0   0.2416  0.4   0.4       (0.5034,0.4966)
  4    (0.5034,0.4966)  (0.4216,0.5784)  0.0   0.2194  0.4   0.4       (0.5599,0.4301)
  5    (0.5599,0.4301)  (0.4255,0.5745)  0.3   0.1995  0.5   0.4       (0.6363,0.3637)
The true class of the central pixel is wheat; and, without using the
contextual information, the central pixel would be misclassified into class
"other." Table 5-2 shows that, using contextual information from the local
neighborhood, after the third iteration the central pixel is correctly
classified.

5.2 CONTEXTUAL CLASSIFICATION RESULTS

Comparative results with and without using contextual information in
classification are presented in this section. Classification maps for
segment 1739 are shown in figures 5-1 through 5-3. It is observed from the
independent test set that the classification accuracy for this segment
increased by 5 percent with the use of contextual information from the
3-by-3 neighborhood and by 7 percent with the use of contextual information
from the 5-by-5 neighborhood (over the accuracies obtained without using
contextual information).

While generally preserving the boundaries, contextual classification corrected
the misclassifications of many pixels and did this more accurately with data
from the 5-by-5 neighborhood than with data from the 3-by-3 neighborhood.

Accuracies in the classification of MSS images of a few segments with and
without the use of contextual information are listed in table 5-3.

In general, an examination of the classification maps of full images and
classification accuracies on the independent test set shows considerable
improvement in the classifications with the use of contextual information.

The improvement is greater with the increase in size of the neighborhood. The
contextual classification of a full segment with a 5-by-5 neighborhood using
the methods developed here took approximately 12 minutes of total time on the
Purdue University Laboratory for Applications of Remote Sensing (LARS) IBM 3031
computer system.
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TABLE 5-3.- CLASSIFICATION ACCURACIES (PERCENTAGES) 
WITH AND WITHOUT CONTEXTUAL INFORMATION 


St9Mnt 


Location 
(county, state) 


without 

context 


with sequential context 
(a)‘“ 


NS • 5 


NS • A 


NS ■ 3 


with 

spatially 
uniform context 


>^1005 


“^lOSO 


Cheyenri, 

Colorado 


85.88 


Sherman, 

Texas 


80.77 


88.46 


88.46 


90.38 


85.58 


82.69 


81.73 


86.54 


81.73


1231(b)


Jackson,

Oklahoma


89.42 


91.35 


91.35


90.38 


91.35 


1520(c)


1604(c)


1675(c)


Big Stone, 
Minnesota 


84.62 


87.50 


85.58 


86.54 


Renville, 
North Dakota 


60.58 


63.46 


60.58 


59.62 


McPherson, 
South Dakota 


68.27 


71.15 


73.08 


68.27 


34.62 


60.58 


67.31 


1739(c)


Teton, 

Montana 


68.27 


75.00 


72.22 


73.08 


70.19 


(a) NS = neighborhood size.

(b) Segments in which class 1 is winter wheat.
(c) Segments in which class 1 is spring wheat.


Figure 5-1.- Classification map of segment 1739 without contextual information.

The variance reduction factors obtained without using contextual information
and with the use of contextual information from a local neighborhood of size 5
are listed in table 5-4.

TABLE 5-4.- VARIANCE REDUCTION FACTORS WITH AND
WITHOUT CONTEXTUAL INFORMATION


                                         Variance reduction factor
Segment   Location                       Without        With sequential
          (county, state)                context        context, NS = 5

1005      Cheyenne, Colorado             0.5720         0.5430
1060      Sherman, Texas                 0.6227         0.4717
1231      Jackson, Oklahoma              [illegible]    0.4173
1520      Big Stone, Minnesota           [illegible]    [illegible]
1604      Renville, North Dakota         [illegible]    0.9741
1675      McPherson, South Dakota        0.9985         0.9248
1739      Teton, Montana                 0.9271         0.8267


Table 5-4 shows that there is a consistent improvement in the variance
reduction factor with the use of contextual information in classification.

6. CONCLUSIONS




In this paper, the problem of incorporating contextual or spatial information
into the classification of imagery data is considered. The contextual
information is introduced into classification based on the spatial
dependencies between the states of nature of neighboring pixels or based on
transition probabilities. The dependencies between neighboring patterns are
modeled with linear and nonlinear models through a single parameter $\theta$,
which describes the transition probabilities of the classes of the neighboring
patterns. An expression is developed for the likelihood function of the
pattern vectors from a general local neighborhood under the following
reasonable assumptions: (a) the probability density function of a pattern,
given its label, is independent of other patterns and their labels; and
(b) the labels of the pattern vectors are independent of the labels of their
nonneighbors. Specific expressions for the likelihood function are derived
for different local neighborhoods and with different transition probabilities
models. The parameter $\theta$ is estimated as the one that maximizes the
likelihood function.

Expressions are presented for updating the a posteriori probabilities of the
classes of a pixel using information from a single neighbor. It is shown that
these expressions can be used to update the a posteriori probabilities of a
pixel under consideration for spatially uniform context and in a general
sequential neighborhood. The contextual information from two-dimensional
neighborhoods is also introduced into the classification of imagery data
through a sequence of sequential neighborhoods.

The techniques presented here are applied to the classification of remotely 
sensed MSS imagery data. Computational results for a typical 5-by-5 neighbor- 
hood are presented. The classification maps are presented with and without 
context, and classification accuracies are given for different sizes of local 
neighborhoods. 









For a two-class, three-sequential-neighborhood case, expressions are developed
for obtaining the transition probabilities without using models. Instead of
using one parameter $\theta$ in the local neighborhood of the pattern under
consideration, as shown in appendix C, transition probabilities models with
different parameters in different directions can be used. The techniques, as
discussed in appendix D, can be used for multitemporal or time-varying
situations such as those encountered in remote sensing.

APPENDIX A


A GENERALIZATION OF SPATIALLY UNIFORM CONTEXT
TO LARGE NEIGHBORHOODS






In this appendix, the contextual relationships developed in section 3.2 for
spatially uniform context are extended to larger neighborhoods. In
particular, the neighborhood shown in figure A-1 is considered. Pixels
with common sides are treated as neighbors, and the diagonal neighbors of
a pixel are treated as nonneighbors.



Figure A-1.- Neighboring pixels in a 3-by-3 local neighborhood.


The a posteriori probabilities of the labels of pixel 0, given the information
from its local neighborhood, can be written as

\[
p(\omega_0 = i_0 \mid X_0, X_1, \ldots, X_8) = \frac{f_1(i_0)}{f} \tag{A-1}
\]

where 
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(A-2) 

and 



\[
f = p(X_0, X_1, \ldots, X_8) \tag{A-3}
\]




The notation of equation (A-4) is used in the remainder of this appendix.

\[
P(i_0, i_1, \ldots, i_8) = P(\omega_0 = i_0, \omega_1 = i_1, \ldots, \omega_8 = i_8) \tag{A-4}
\]

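Using equations (13) and (A-4) in equations (A-2) and (A-3), $f_1(i_0)$ and
$f$ can be written as

\[
f_1(i_0) = \sum_{i_1=1}^{M} \cdots \sum_{i_8=1}^{M}
\left[ \prod_{j=0}^{8} p(X_j \mid \omega_j = i_j) \right] P(i_0, i_1, \ldots, i_8) \tag{A-5}
\]

and

\[
f = \sum_{i_0=1}^{M} f_1(i_0) \tag{A-6}
\]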


It was assumed that the labels of the pixels are independent of the labels of
their nonneighboring pixels, with the neighboring pixels defined as in
figure A-1. Now consider

P(^0*h*‘***^8^ “ P(1o)P(^l’*•*•^3!^l■^■ 

■ P(1o)'*di 

P(l7|1o.ig)P(1g|1o) (A-7) 

The second term in the right-hand side of equation (A-7) can be written as 
P(l2^|1g»l2»'**dg) ■ P(^|_dgd2»^3) 

P(iQ|1^,i2,ig)P(ipi2lig) 

f*(iol'' 2 ’V^^^ 2 i'' 8 ^ 

pnppnp 



Similar to equation (A-8), the other terms of equation (A-7) can be shown to
be the following.

Using equations (A-8) and (A-9) in equation (A-7) results in
IPdolCPdjMtlPOali?)— 

pdo.h.— .ig) 


(A-10) 


Expressing the transition probabilities in equation (A-10) in terms of the
parameters [equation (1) or equation (11)], equations (A-1), (A-5), and (A-6)
can be used to incorporate the contextual information from the local
neighborhood. As in section 3, $\theta$ can be obtained by maximizing the
likelihood function of the spectral values of the pixels 0, 1, 2, 3.


Equation (A-10) also can be used to estimate the transition probabilities in
the local neighborhood if the labels of the pixels are known. For example,
in remote sensing, the labels of the pixels (ground truth) are known for a
selected set of images. Often it is necessary to estimate the transition
probabilities. The following example illustrates, for a few typical
neighborhoods, the transition probabilities obtained from the maximization of
equation (A-10). The a priori probabilities in the local neighborhood are
estimated as an average of the a posteriori probabilities of the classes.

Example: This example illustrates the transition probabilities obtained by
computing $\theta$, which maximizes equation (A-10). These are listed for a
few typical neighborhoods in table A-1. For neighboring pixels A and B, the
notation used for the transition probabilities listed in table A-1 is shown in
figure A-2.


TABLE A-1.- MAXIMUM LIKELIHOOD ESTIMATES OF TRANSITION PROBABILITIES
FOR SOME TYPICAL NEIGHBORHOODS


A priori probabilities                                  Transition probabilities

$P(\omega = 1) = 0.6667$, $P(\omega = 2) = 0.3333$

$P(\omega = 1) = 0.4444$, $P(\omega = 2) = 0.5556$

$P(\omega = 1) = 0.2222$, $P(\omega = 2) = 0.7778$


$\theta = 0.25$


$\theta = 0.35$

APPENDIX B

OPTIMAL TRANSITION PROBABILITIES FOR A TWO-CLASS,
3-BY-3 NEIGHBORHOOD CASE

In general, the transition probabilities that maximize the likelihood function
can be obtained using optimization methods such as the Davidon-Fletcher-Powell
procedure. This requires searching for M x M parameters, where M is the
number of classes. Using the transition probabilities models of section 2,
the likelihood function is expressed as a function of a single parameter
$\theta$. However, for a two-class, 3-by-3 neighborhood case, expressions for
the transition probabilities that maximize the likelihood function, without
using models for the transition probabilities, are obtained in the following
manner.

Let A and B be the neighboring pixels. Let there be two classes. Then we
have the following theorems.

Theorem 1: For a two-class case, if the a priori probabilities are position
independent in the local neighborhood [i.e., $P(\omega_A = i) = P(\omega_B = i)$],
then the transition probabilities are symmetric. That is,

\[
P(\omega_A = 1 \mid \omega_B = 2) = P(\omega_B = 1 \mid \omega_A = 2) \tag{B-1}
\]

Proof: Consider

\[
\begin{aligned}
P(\omega_A = 1 \mid \omega_B = 2) &= 1 - P(\omega_A = 2 \mid \omega_B = 2) \\
&= 1 - \frac{P(\omega_A = 2)}{P(\omega_B = 2)} \, P(\omega_B = 2 \mid \omega_A = 2) \\
&= 1 - [1 - P(\omega_B = 1 \mid \omega_A = 2)] \\
&= P(\omega_B = 1 \mid \omega_A = 2)
\end{aligned}
\]


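Theorem 2: Let $\theta_1 = P(\omega_A = 1 \mid \omega_B = 1)$ and
$\theta_2 = P(\omega_A = 2 \mid \omega_B = 2)$. The transition probabilities
and the a priori probabilities are related as

\[
(1 - \theta_2) = \frac{p_1}{1 - p_1} \, (1 - \theta_1) \tag{B-2}
\]

where $p_1 = P(\omega_A = 1) = P(\omega_B = 1)$.

Proof: Using the Bayes theorem, we obtain

\[
P(\omega_A = 1 \mid \omega_B = 2) = \frac{P(\omega_A = 1)}{P(\omega_B = 2)} \, P(\omega_B = 2 \mid \omega_A = 1)
\]

That is,

\[
(1 - \theta_2) = \frac{p_1}{1 - p_1} \, (1 - \theta_1)
\]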
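
The two theorems can be checked numerically. The sketch below is illustrative only: it assumes $\theta_1 = P(\omega_A = 1 \mid \omega_B = 1)$, $\theta_2 = P(\omega_A = 2 \mid \omega_B = 2)$, and the relation $(1 - \theta_2) = p_1 (1 - \theta_1)/(1 - p_1)$, builds the joint distribution of the two labels, and compares the two transition probabilities of Theorem 1. The function names and the value $\theta_1 = 0.8$ are hypothetical and do not come from the report.

```python
def theta2_from_theta1(p1, theta1):
    """theta2 implied by p1 = P(w = 1) and theta1 through the Theorem 2 relation."""
    theta2 = 1.0 - (p1 / (1.0 - p1)) * (1.0 - theta1)
    if not 0.0 <= theta2 <= 1.0:
        raise ValueError("theta1 is outside the range allowed by p1")
    return theta2


def joint_labels(p1, theta1):
    """Joint P(w_A = a, w_B = b) as a 2x2 list indexed [a - 1][b - 1]."""
    p2 = 1.0 - p1
    theta2 = theta2_from_theta1(p1, theta1)
    return [
        [theta1 * p1, (1.0 - theta2) * p2],
        [(1.0 - theta1) * p1, theta2 * p2],
    ]


# p1 is the a priori probability quoted in the example of this appendix;
# theta1 = 0.8 is a made-up test value.
p1, theta1 = 0.5531, 0.8
joint = joint_labels(p1, theta1)

# P(w_A = 1 | w_B = 2) and P(w_B = 1 | w_A = 2), computed from the joint table.
p_a1_given_b2 = joint[0][1] / (joint[0][1] + joint[1][1])
p_b1_given_a2 = joint[1][0] / (joint[1][0] + joint[1][1])
print(p_a1_given_b2, p_b1_given_a2)
```

The two printed values should agree up to rounding, and the row sums of `joint` reproduce the a priori probabilities, which is the position independence assumed in Theorem 1.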


Theorems 1 and 2 are used in the following to obtain $\theta_1$ and
$\theta_2$. The likelihood function $L(\theta_1, \theta_2)$ for the
three-pixel sequential neighborhood of figure 3-4 can be expressed in terms of
$\theta_1$ and $\theta_2$ as

■ »l^m ^ ‘ (' - - »2h21 

*• - *l>*2n22 * - «2>^211 * (1 - «2)<‘ - «1>^2I2 

♦ 9j(l - 9,)aj2i * ®l*222 (B-3) 


where $a_{ijk}$ are given by


P(J|X.)p(k!Xg) 

‘ijk ’ PTw » j")P(<u At 


(B-4) 


and i, j, and k take values 1 or 2. From equations (B-2) and (B-3), the
likelihood function can be expressed in terms of parameter $\theta_1$ as

\[
L(\theta_1) = b_1 \theta_1^2 + b_2 \theta_1 + b_3 \tag{B-5}
\]

where

(B-6)

Let $v_1$ be the value of $\theta_1$ obtained by differentiating equation (B-5)
with respect to $\theta_1$ and equating the resulting expression to zero.
That is,

\[
v_1 = -\frac{b_2}{2 b_1} \tag{B-7}
\]


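The parameters $\theta_1$ and $\theta_2$ should lie in the interval 0 to 1.
Let $v_2$ and $v_3$ be the end points of $\theta_1$. They are given by

\[
v_2 = 1
\]

and

\[
v_3 = 0 \ \text{if } p_1 \le \tfrac{1}{2}; \quad \text{otherwise, } v_3 = \frac{2 p_1 - 1}{p_1} \tag{B-8}
\]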


Now, the optimal $\theta_1$ and $\theta_2$ can be obtained as follows. If
$0 < v_1 < 1$, choose as the optimal value $\theta_{1opt}$ whichever of $v_1$,
$v_2$, or $v_3$ gives the largest value of $L(\theta_{1opt})$. If $v_1$ lies
outside the interval 0 to 1, choose as $\theta_{1opt}$ whichever of $v_2$ or
$v_3$ gives the largest value of $L(\theta_{1opt})$; $\theta_{2opt}$ is
computed from equation (B-2).
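
The selection rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the report's program: it assumes the likelihood is the quadratic $L(\theta_1) = b_1 \theta_1^2 + b_2 \theta_1 + b_3$, that $\theta_2$ follows from $\theta_1$ through $(1 - \theta_2) = p_1 (1 - \theta_1)/(1 - p_1)$, and that the lower end point is 0 when $p_1 \le 1/2$ and $(2 p_1 - 1)/p_1$ otherwise. The function name and the coefficient values are hypothetical. The stationary point is accepted only when it lies between the two end points, which is slightly stricter than the interval 0 to 1 and keeps $\theta_2$ in range.

```python
def optimal_thetas(b1, b2, b3, p1):
    """Pick theta1 among the stationary point and the end points, then derive theta2.

    Assumes 0 < p1 < 1 and L(theta1) = b1*theta1**2 + b2*theta1 + b3.
    """
    def likelihood(theta1):
        return b1 * theta1 ** 2 + b2 * theta1 + b3

    v2 = 1.0                                           # upper end point
    v3 = 0.0 if p1 <= 0.5 else (2.0 * p1 - 1.0) / p1   # lower end point
    candidates = [v2, v3]

    if b1 != 0.0:
        v1 = -b2 / (2.0 * b1)                          # stationary point of the quadratic
        if v3 <= v1 <= v2:                             # keep it only if theta2 stays in [0, 1]
            candidates.append(v1)

    theta1_opt = max(candidates, key=likelihood)
    theta2_opt = 1.0 - (p1 / (1.0 - p1)) * (1.0 - theta1_opt)
    return theta1_opt, theta2_opt


# Made-up coefficients for illustration only.
print(optimal_thetas(-1.0, 1.2, 0.0, 0.4))
print(optimal_thetas(1.0, 0.0, 0.0, 0.75))
```

In the first call the quadratic is concave and its stationary point lies inside the admissible interval, so it is selected; in the second the quadratic is convex and an end point is selected.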


Example: For a few typical sequential neighborhoods, this example illustrates
the transition probabilities computed using the linear and nonlinear models of
section 2, using the procedure of this appendix, and using the
Davidon-Fletcher-Powell optimization technique. The a posteriori
probabilities of the classes are those of the 3-by-3 local neighborhood of
dot 89 from segment 1739, Teton County, Montana. These are obtained by
normalizing the outputs of a linear classifier. The four sequential
neighborhoods are neighborhoods in four directions: 0°, 45°, 90°, and 135°,
centering on the central pixel. The a priori probabilities are computed as
the average of the a posteriori probabilities in the neighborhood. Class 1 is
wheat and class 2 is "other." The a priori probabilities computed for this
3-by-3 neighborhood are

\[
P(\omega = 1) = 0.5531, \qquad P(\omega = 2) = 0.4469 \tag{B-9}
\]

The estimated transition probabilities are listed in table B-1.





































































Table B-1 shows that the estimated transition probabilities agree well across
the different procedures. With linear models, the parameter $\theta$ tends to
be zero for mixed neighborhoods, thus ignoring spatial information from mixed
neighborhoods.

APPENDIX C


ESTIMATION OF TRANSITION PROBABILITIES WITH DIFFERENT
PARAMETERS IN THE LOCAL NEIGHBORHOOD


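In this appendix, some results are developed for estimating the transition
probabilities with different parameters in different directions in the local
neighborhood and with interactions in the parameters. The local neighborhood
considered is shown in figure 3-1. The linear model of equation (1) is used
for the transition probabilities. Let $\theta_H$ and $\theta_V$ be the
parameters of the transition probabilities model for horizontal and vertical
neighbors, respectively. For the local neighborhood illustrated in
figure 3-1, consider the following equation from section 3.2:

\[
\begin{aligned}
L(\theta) = L(\theta_H, \theta_V)
&= \frac{p(X_0, X_1, \ldots, X_4)}{\prod_{j=0}^{4} p(X_j)} \\
&= \sum_{i_0=1}^{M} p(\omega = i_0 \mid X_0)
\left\{ \prod_{j=1,3} \left[ (1 - \theta_V) + \theta_V \frac{p(\omega = i_0 \mid X_j)}{P(\omega = i_0)} \right] \right\}
\left\{ \prod_{j=2,4} \left[ (1 - \theta_H) + \theta_H \frac{p(\omega = i_0 \mid X_j)}{P(\omega = i_0)} \right] \right\} \\
&= (1 - \theta_V)^2 (1 - \theta_H)^2
 + (1 - \theta_V)^2 (1 - \theta_H) \theta_H \alpha_H
 + (1 - \theta_V)^2 \theta_H^2 \beta_H \\
&\quad + (1 - \theta_V) \theta_V (1 - \theta_H)^2 \alpha_V
 + (1 - \theta_V) \theta_V (1 - \theta_H) \theta_H \alpha_{VH}
 + (1 - \theta_V) \theta_V \theta_H^2 \, \alpha\beta_{VH} \\
&\quad + \theta_V^2 (1 - \theta_H)^2 \beta_V
 + \theta_V^2 (1 - \theta_H) \theta_H \, \alpha\beta_{HV}
 + \theta_V^2 \theta_H^2 \beta_{VH}
\end{aligned}
\tag{C-1}
\]

where

\[
\begin{aligned}
\alpha_H &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_2) + p(\omega = i_0 \mid X_4) \right] \\
\beta_H &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^2(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_2) \, p(\omega = i_0 \mid X_4) \right] \\
\alpha_V &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_1) + p(\omega = i_0 \mid X_3) \right] \\
\beta_V &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^2(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_1) \, p(\omega = i_0 \mid X_3) \right] \\
\alpha_{VH} &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^2(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_1) + p(\omega = i_0 \mid X_3) \right]
 \left[ p(\omega = i_0 \mid X_2) + p(\omega = i_0 \mid X_4) \right] \\
\beta_{VH} &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^4(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_1) \, p(\omega = i_0 \mid X_3) \right]
 \left[ p(\omega = i_0 \mid X_2) \, p(\omega = i_0 \mid X_4) \right] \\
\alpha\beta_{HV} &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^3(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_2) + p(\omega = i_0 \mid X_4) \right]
 \left[ p(\omega = i_0 \mid X_1) \, p(\omega = i_0 \mid X_3) \right] \\
\alpha\beta_{VH} &= \sum_{i_0=1}^{M} \frac{p(\omega = i_0 \mid X_0)}{P^3(\omega = i_0)}
 \left[ p(\omega = i_0 \mid X_1) + p(\omega = i_0 \mid X_3) \right]
 \left[ p(\omega = i_0 \mid X_2) \, p(\omega = i_0 \mid X_4) \right]
\end{aligned}
\]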

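To determine $\theta_V$ and $\theta_H$ that maximize equation (C-1), one takes
partial derivatives of equation (C-1) with respect to $\theta_V$ and
$\theta_H$ and solves the resulting equations for $\theta_V$ and $\theta_H$.
Taking the partial derivative of equation (C-1) with respect to $\theta_V$,
equating the resulting expression to zero, and solving for $\theta_V$, one
obtains

\[
\theta_V = \left( \frac{1}{2} \right)
\frac{a_{N2} \theta_H^2 + a_{N1} \theta_H + a_{N0}}{a_{D2} \theta_H^2 + a_{D1} \theta_H + a_{D0}} \tag{C-2}
\]

where

\[
\begin{aligned}
a_{N2} &= 2 - 2\alpha_H + 2\beta_H - \alpha_V + \alpha_{VH} - \alpha\beta_{VH} \\
a_{N1} &= -4 + 2\alpha_H + 2\alpha_V - \alpha_{VH} \\
a_{N0} &= 2 - \alpha_V \\
a_{D2} &= 1 - \alpha_H + \beta_H - \alpha_V + \alpha_{VH} - \alpha\beta_{VH} + \beta_V - \alpha\beta_{HV} + \beta_{VH} \\
a_{D1} &= -2 + \alpha_H + 2\alpha_V - \alpha_{VH} - 2\beta_V + \alpha\beta_{HV} \\
a_{D0} &= 1 - \alpha_V + \beta_V
\end{aligned}
\]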

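Similarly, taking the partial derivative of equation (C-1) with respect to
$\theta_H$, equating the resulting expression to zero, and solving for
$\theta_H$, one obtains

\[
\theta_H = \left( \frac{1}{2} \right)
\frac{b_{N2} \theta_V^2 + b_{N1} \theta_V + b_{N0}}{b_{D2} \theta_V^2 + b_{D1} \theta_V + b_{D0}} \tag{C-3}
\]

where

\[
\begin{aligned}
b_{N2} &= 2 - \alpha_H - 2\alpha_V + \alpha_{VH} + 2\beta_V - \alpha\beta_{HV} \\
b_{N1} &= -4 + 2\alpha_H + 2\alpha_V - \alpha_{VH} \\
b_{N0} &= 2 - \alpha_H \\
b_{D2} &= 1 - \alpha_H + \beta_H - \alpha_V + \alpha_{VH} - \alpha\beta_{VH} + \beta_V - \alpha\beta_{HV} + \beta_{VH} \\
b_{D1} &= -2 + 2\alpha_H - 2\beta_H + \alpha_V - \alpha_{VH} + \alpha\beta_{VH} \\
b_{D0} &= 1 - \alpha_H + \beta_H
\end{aligned}
\]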


Substituting the expression for $\theta_V$ from equation (C-2) into
equation (C-3) results in a fifth-order algebraic equation, the roots of which
can be obtained by numerical methods (refs. 15, 16). Let the resulting roots
be $\theta_{Hr}(i)$; i = 1, 2, ..., 5. From equation (C-2), corresponding
values are obtained for $\theta_{Vr}(i)$; i = 1, 2, ..., 5.
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where 7f(1) 1s a vector. Let 
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; 1 ■ 1,2, •••,5 
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(C-5) 


Now, $\theta_{opt}$ for $0 < \theta_H < 1$ and $0 < \theta_V < 1$, which
maximizes equation (C-1), can be obtained using a procedure similar to that
given in the flow diagram of figure 3-7. The above analysis can be
generalized with different parameters for more than two directions and for
larger neighborhoods to obtain the transition probabilities.
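
The two-parameter likelihood of this appendix can be explored numerically with the short sketch below. It is an illustration under assumptions, not the report's procedure: each neighbor is assumed to contribute a factor $(1 - \theta) + \theta \, p(\omega = i_0 \mid X_j)/P(\omega = i_0)$, pixels 1 and 3 are taken as the vertical neighbors and pixels 2 and 4 as the horizontal neighbors, and a coarse grid search stands in for solving the fifth-order equation. The function names and the a posteriori values are hypothetical; the a priori probabilities are taken as the average of the a posteriori probabilities.

```python
def likelihood(theta_h, theta_v, post, prior):
    """L(theta_H, theta_V) in product form; post[j][c] = p(w = c | X_j), j = 0..4."""
    total = 0.0
    for c, p_c in enumerate(prior):
        term = post[0][c]
        for j in (1, 3):                      # vertical neighbors (assumed indexing)
            term *= (1.0 - theta_v) + theta_v * post[j][c] / p_c
        for j in (2, 4):                      # horizontal neighbors (assumed indexing)
            term *= (1.0 - theta_h) + theta_h * post[j][c] / p_c
        total += term
    return total


def stationary_theta_v(theta_h, post, prior):
    """theta_V at which dL/d(theta_V) = 0 for a fixed theta_H.

    For fixed theta_H, L = (1-t)^2 A + (1-t) t B + t^2 C in t = theta_V,
    so the stationary point is (1/2)(2A - B)/(A - B + C).
    Assumes A - B + C is not zero.
    """
    a = b = c2 = 0.0
    for c, p_c in enumerate(prior):
        weight = post[0][c]
        for j in (2, 4):
            weight *= (1.0 - theta_h) + theta_h * post[j][c] / p_c
        r1, r3 = post[1][c] / p_c, post[3][c] / p_c
        a += weight
        b += weight * (r1 + r3)
        c2 += weight * r1 * r3
    return 0.5 * (2.0 * a - b) / (a - b + c2)


def grid_maximum(post, prior, steps=50):
    """Coarse grid search over [0, 1] x [0, 1]; returns (L, theta_H, theta_V)."""
    grid = [k / steps for k in range(steps + 1)]
    return max((likelihood(h, v, post, prior), h, v) for h in grid for v in grid)


# Made-up a posteriori probabilities for pixels 0..4 (two classes).
post = [[0.6, 0.4], [0.7, 0.3], [0.55, 0.45], [0.8, 0.2], [0.5, 0.5]]
prior = [sum(p[c] for p in post) / len(post) for c in range(2)]

best_l, best_h, best_v = grid_maximum(post, prior)
print(best_l, best_h, best_v)
```

A stationary point returned by `stationary_theta_v` may be a minimum rather than a maximum, which is why the end points of the interval also have to be compared, as in the flow diagram referred to above.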
APPENDIX D


MULTITEMPORAL INTERPRETATION OF CONTEXT

In this appendix, a multitemporal interpretation of the theory developed in
the paper is given for applications such as those in the machine processing of
remotely sensed imagery data. In remote sensing, the sensor system usually
makes several passes over the same ground area and acquires a set of data for
each pass or acquisition. The data from these passes are registered, and the
classification is performed on the registered data. Let there be r
acquisitions. For every pixel in acquisition i, a data vector $X_i$
(i = 1, 2, ..., r) is acquired. Suppose that acquisitions 2, ..., r are
registered with respect to acquisition 1. There will be variations in the
data of each pixel from acquisition to acquisition. Also, errors are
encountered in registration.

Let the classifier be trained on the data representative of the individual
acquisitions, obtaining the probability density functions
$p(X \mid \omega = i)$, i = 1, 2, ..., M, for each acquisition. This appendix
presents the application of the theory of sequential context developed in the
paper for the classification of the pixel under consideration. Let $X_i$ be
the spectral vector of the pixel under consideration in acquisition i
(i = 1, 2, ..., r), used in the classification of that pixel. This approach
takes into account the registration errors and the variations in the data from
acquisition to acquisition. The pixel is classified using the decision rule:
classify it to class $\omega = j$ if

\[
p(\omega = j \mid X_1, \ldots, X_r) > p(\omega = i \mid X_1, \ldots, X_r); \quad
i = 1, 2, \ldots, M; \ i \ne j \tag{D-1}
\]

The dependencies from acquisition to acquisition can be modeled through the
models of section 2; the transition probabilities can then be estimated using
the techniques developed in section 3; the a posteriori probabilities of the
classes of the pixel, using data from all the acquisitions, can be computed
using the techniques developed in section 4; and the pixel can be classified
using equation (D-1).

If there are no errors in registering the data from acquisition to
acquisition, the transition probabilities satisfy the following relation.

\[
P(\omega_n = k \mid \omega_{n-1} = i) =
\begin{cases}
1 & \text{if } i = k \\
0 & \text{if } i \ne k
\end{cases}
\tag{D-2}
\]

where $\omega_n$ is the class of the pixel under consideration from the nth
acquisition. Using sequential context [equation (45)], the a posteriori
probabilities of the classes of the pixel under consideration can be written
in terms of the pixel spectral vectors from each acquisition as follows.
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Ni;(l^) 


(D-3)




where 
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1^-1 

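(D-4)

From equations (D-2) through (D-4), equation (D-5) is obtained.

\[
p(\omega_r = i_r \mid X_1, \ldots, X_r) =
\frac{P(\omega_1 = i_r) \, p(X_1 \mid \omega_1 = i_r) \, p(X_2 \mid \omega_2 = i_r) \cdots p(X_r \mid \omega_r = i_r)}
{\sum_{i_r=1}^{M} P(\omega_1 = i_r) \, p(X_1 \mid \omega_1 = i_r) \, p(X_2 \mid \omega_2 = i_r) \cdots p(X_r \mid \omega_r = i_r)}
\tag{D-5}
\]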


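Thus, use of sequential context with assumptions (a) and (b) of section 4 and
of equation (D-2) in the classification of a pixel in a multitemporal
situation amounts to the class-conditional independence of the pixel spectral
vectors of each acquisition. Equation (D-5) can also be written as follows.

\[
p(\omega_j = i_j \mid X_1, \ldots, X_j) =
\frac{p(\omega_{j-1} = i_j \mid X_1, \ldots, X_{j-1}) \, p(X_j \mid \omega_j = i_j)}
{\sum_{i_j=1}^{M} p(\omega_{j-1} = i_j \mid X_1, \ldots, X_{j-1}) \, p(X_j \mid \omega_j = i_j)}
\quad \text{for } j = 2, 3, \ldots, r \tag{D-6}
\]

\[
p(\omega_1 = i_1 \mid X_1) =
\frac{P(\omega_1 = i_1) \, p(X_1 \mid \omega_1 = i_1)}
{\sum_{i_1=1}^{M} P(\omega_1 = i_1) \, p(X_1 \mid \omega_1 = i_1)} \tag{D-7}
\]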


Equations (D-6) and (D-7) can be interpreted as follows: when the first
acquisition is acquired, the a priori knowledge $P(\omega_1 = i_1)$ about the
classes of the pixel under consideration is modified into a posteriori
probabilities according to equation (D-7). These become the a priori
knowledge for the next acquisition. With the use of the observed spectral
vector, the a priori knowledge is modified into a posteriori probabilities
according to equation (D-6). When no registration errors are present,
equation (D-6) can be used sequentially in a multitemporal situation to
incorporate the contextual information in the classification of the pixel
under consideration.

However, using the techniques developed in the paper, this multitemporal
interpretation can be easily coupled with the spatial information from
two-dimensional neighborhoods.
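
The sequential interpretation can be illustrated with a small sketch. It assumes no registration errors, so the class of the pixel is the same in every acquisition, and it assumes the class-conditional densities have already been evaluated at the observed spectral vectors. The function names and the numerical values are hypothetical. The sketch updates the class probabilities one acquisition at a time and compares the result with the single normalized product over all acquisitions.

```python
def sequential_posteriors(prior, likelihoods):
    """Update class probabilities one acquisition at a time.

    prior[c] = P(w = c); likelihoods[n][c] = p(X_n | w = c) for acquisition n.
    After each acquisition the posterior becomes the prior for the next one.
    """
    current = list(prior)
    for lik in likelihoods:
        unnormalized = [p * l for p, l in zip(current, lik)]
        total = sum(unnormalized)
        current = [u / total for u in unnormalized]
    return current


def batch_posteriors(prior, likelihoods):
    """Normalize the product of the prior and all class-conditional densities at once."""
    unnormalized = list(prior)
    for lik in likelihoods:
        unnormalized = [u * l for u, l in zip(unnormalized, lik)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Made-up density values for three acquisitions and two classes.
prior = [0.5531, 0.4469]
likelihoods = [[0.2, 0.5], [0.6, 0.3], [0.7, 0.1]]

seq = sequential_posteriors(prior, likelihoods)
bat = batch_posteriors(prior, likelihoods)
label = max(range(len(seq)), key=lambda c: seq[c]) + 1   # decision rule: largest posterior
print(seq, bat, label)
```

The two results agree up to rounding, which is the class-conditional independence property stated above: each acquisition simply multiplies in one more density factor before normalization.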
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